Abstract. We use a multi-color classification method introduced
by Wolf, Meisenheimer & Röser (2000) to reliably identify
stars, galaxies and quasars in the up to 16-dimensional
color space provided by the filter set of the Calar Alto Deep 
Imaging Survey (CADIS). The samples of stars, galaxies and 
quasars obtained this way have been used for dedicated studies 
which are published in separate papers. 

The classification is good enough to detect quasars rather 
completely and efficiently without confirmative spectroscopy. 
The multi-color redshifts are accurate enough for most statis- 
tical applications, e.g. evolutionary studies of the galaxy lumi- 
nosity function. Also, the separation between stars and galaxies 
reaches deeper than with morphological criteria, so that studies 
of the stellar population can be extended to fainter levels. 

We characterize the dataset presently available on the 
CADIS 1h-, 9h- and 16h-fields. Using Monte-Carlo simulations
we model the classification performance expected for
CADIS. We present a summary of the classification results 
on the CADIS database and discuss unclassified objects. More 
than 99% of the whole catalog sample at R < 22 (more than 
95% at R < 23) are successfully classified matching the ex- 
pectations derived from the simulations. A small number of 
peculiar objects challenging the classification are discussed in 
detail. 

Spectroscopic observations are used to check the reliability 
of the multi-color classification (6 mistakes among 151 objects 
with R < 24). From these, we also determine the accuracy 
of the multi-color redshifts which are rather good for galaxies 
($\sigma_z \approx 0.03$) and useful for quasars. We find that the
classification performance derived from the simulations compares well
with results from the real survey. Finally, we locate areas for 
potential improvement of the classification. 

Key words: Methods: data analysis - Methods: statistical - 
Techniques: photometric - Surveys 



* Partly based on observations collected at the European Southern
Observatory, Paranal, Chile (ESO Programmes 64.O-0401 



1. Introduction 

The Calar Alto Deep Imaging Survey (CADIS) is an extragalactic
key project at the Max-Planck-Institut für Astronomie
(MPIA), Heidelberg, which pursues two types of objectives:
CADIS investigates whole samples of different object
classes using statistical tools, but it also searches for individual
rare and faint objects, which will be studied in detail with
upcoming large telescopes. As a pencil beam survey, it probes
seven different fields at galactic latitudes b ≳ 45° with a total
area of ~ 0.25 square degrees.

The final object catalog will arise from two fundamentally 
different survey techniques: 

- a multi-color survey with B, R, J and K' plus 13 medium-band
filters from 400 nm to 1000 nm, practically resembling
low-resolution imaging spectroscopy and giving a
complete list of objects with R ≲ 23,

- and an emission-line survey using an imaging Fabry-Perot
interferometer to probe emission line galaxies down to a
limiting line flux of ~ 3 x 10^-20 W m^-2.

Presently, the data for three fields are reduced and have 
been used for a number of application studies published al- 
ready or to be published this year. For many applications, ob- 
jects of concern are selected by our multi-color classification 
which uses the many bands to sort the objects into stars, galax- 
ies and quasars. Also, multi-color redshifts are estimated for 
the extragalactic objects. This classification scheme was originally
developed on the basis of CADIS data, but is now
used in a range of different survey activities. It uses a library of
~ 65000 templates and achieves high classification reliability
(> 90% correct classification in each class) and a high redshift
accuracy of $\sigma_z \approx 0.03$ for galaxies and
$\sigma_z \approx 0.1$ for quasars.
The methodological background for the classification was published
by Wolf, Meisenheimer & Röser (2000), hereafter paper
I. There, the classification method is derived from statistical
principles, the libraries are defined, the performance expected
with different filter sets is compared, and conclusions are
drawn for optimum survey strategies.

The purpose of this paper is to characterize the present data
of the CADIS multi-color survey and to discuss the classifica- 
tion performance which was checked by a subsample of ob- 
jects with spectroscopic identifications. The paper is organized 
as follows: Section 2 lists CADIS goals for which the clas- 
sification is relevant and discusses what kind of objects the 
classifications should be prepared for. Section 3 defines the 
present CADIS dataset and characterizes its photometry and 
calibration. Section 4 outlines the classification method and 
presents statistics of classified and unclassified objects. Sec- 
tion 5 discusses the classification performance using a spec- 
troscopic cross check sample. Finally, Section 6 summarizes 
the quality of the classified catalogs and evaluates the prac- 
tical performance of the CADIS multi-color classification. A 
few identified peculiar objects challenging the classification are 
discussed in the appendix. 

2. Objects and objectives in CADIS 

Since the object classification in CADIS is based on colors, we
want to ensure that the measured colors are accurate and not
reddened by interstellar dust absorption in our Galaxy. Therefore,
all CADIS fields are placed in zero-reddening areas, i.e.
local minima of the IRAS 100 μm maps with undetected fluxes
(< 2 MJy/sterad). A second obvious advantage of this choice
is increased depth for extragalactic work (see Table 1 for
coordinates of the three CADIS fields included in this paper).

Although we know what objects we are looking for, we
have to carefully determine which objects we might be confronted
with in order to check whether a color-based classification
can deal with them. A CADIS field measures at least 10' x 10'
in size when observed with MOSCA at the 3.5-m-telescope
on Calar Alto. Most observations used CAFOS at the 2.2-m-telescope
and cover a larger round field of 14' diameter and
154 square arcminutes area. On the minimum area of 10' x 10' we
expect the following objects in a typical CADIS field:

- Galaxies are the most abundant object class in the CADIS
fields. Published number counts lead us to expect some 750
galaxies with R < 23 per field (Metcalfe et al. 1995).

- Stars should be the second-most common objects. According
to models of the Milky Way we expect 100 to 150 stars
with R < 23 per field depending on its coordinates (Bahcall
& Soneira 1981). Most stars should be late-type main
sequence stars, and only few should be of other types: We
expect two white dwarfs with R < 23 by assuming their
spatial distribution to follow the general thin plus thick disk
population of main sequence stars and by using their local
luminosity function (Bessell & Stringfellow 1993). Furthermore,
we expect 0.15 red disk giants with V > 12
per field area (Bahcall & Soneira 1981). Also, halo giants
should be negligible in the interesting CADIS magnitude
range of 18 ≲ R ≲ 23. While T dwarfs are not to be expected
with R < 23 in such a small survey area at high
galactic latitudes, L dwarfs could be present in low numbers.
In fact, CADIS has already identified an L1 dwarf
with I = 22.5 undetected in R (Wolf et al. 1991).

Table 1. Positions of the CADIS field centers on the sky (±5")
and number of filters observed, see Tab. 2 for details:

- Active galaxies with broad-line emission spectra are supposedly
the third-most common object population in the
CADIS fields. At B < 23.5 we expect 12 objects per
field, i.e. about eight Seyfert-1 galaxies and four quasars
(Hartwick & Schade 1990), with the most distant objects
residing out to z ≈ 3.

- The total space density of Seyfert-2 galaxies is estimated to 
be four to eight times higher than Seyfert-1 galaxies (Wolt- 
jer 1990). We do not have estimates for their abundance 
in the CADIS fields. They should be visible in the redshift 
range of normal galaxies, since their SED is dominated by 
the host, especially at UV restframe wavelengths. Up to 
now, we are unable to separate them from starburst galaxies 
with CADIS photometry. In any case, there should be few 
objects with R < 23 per field. 

- We do not really expect a BL Lac object in our fields, since 
only a few hundred of them are known on the entire sky, 
and CADIS will survey in total ~ 1/200,000 of the sky. 

Therefore, we conclude that classifying objects into stars, 
galaxies and quasars would be sufficient for classifying more 
than 99% of the objects in the catalog. In this terminology, we 
included all broad emission-line AGNs into the term quasar. 
Using these classes, the multi-color object catalog of CADIS 
has been partitioned into subsamples and used as an input to 
four applications, for which completeness and contamination 
by wrongly classified objects can make an important differ- 
ence: 

- Deep star counts are used to probe the galactic structure 
and the stellar luminosity function. Our multi-color classification
can separate stars and galaxies even at faint levels
where the morphology is difficult to measure. Consequently,
compact galaxies are also eliminated as contaminants
from the list of stars. CADIS probes the Milky Way
on different lines of sight and allows us to derive stellar density
distributions. This way a clear signature of a thick disk
was found in the first two fields analysed (Phleps et al. 
2000). 

- Galaxies of I < 23 and z ≲ 1 were used to investigate the
evolution of the galaxy luminosity function with redshift.
Such an analysis is particularly sensitive to selection effects
and K corrections. By not excluding morphologically
stellar objects, we include an extra population of compact
galaxies, which at R = 21 ... 23 and 1" seeing make up
20% of all galaxies. The many wavebands in the CADIS
filter set give us complete SED information and provide accurate
restframe B-band photometry directly from the data,
as long as the redshift estimate is correct. This way we do
not need estimated K corrections at least out to z ~ 1. Also,
the accuracy of the multi-color redshifts allows us to work
with the full multi-color sample of ~ 10000 galaxies when
completed. The first study with 2779 objects maps out the
luminosity evolution, which is clearly differential among the
galaxy types: starburst galaxies show a steepening of
their luminosity function with redshift and an increasing
space density (Fried et al. 2000).

- Galaxies from our medium-deep K-band images (K' < 
19.5) were used for a number count analysis. Again, the 
selection of galaxies works to rather deep limits and even 
compact galaxies are taken into account. This study estab- 
lishes the currently largest statistics among medium-deep 
K-band surveys and settles earlier divergence on K-band 
counts in the range of 16.5 < K' < 19.5 (Huang et al. 
2000). 

- Quasars with R < 22 were used for a first comparison with 
expected counts. This rare class is most sensitive to contam- 
ination, of which the CADIS classification is almost free. 
Completeness appears to be extremely high, given that the 
CADIS counts show an excess in faint high-redshift (z > 2) 
quasars. Also, a few interesting quasar pairs at separations
on the order of ~ 1' to 5' were spectroscopically confirmed
(Wolf et al. 1999).



3. The CADIS data 

3.1. Basic data characteristics in CADIS 

The CADIS database contains information on multi-band pho- 
tometry, morphology and position of each object. The pho- 
tometry will encompass data from the four broad-band filters 
CADIS-B, CADIS-R, J and K' as well as from 13 medium- 
band filters when completed. This photometric database is col- 
lected over five years at Calar Alto Observatory in Spain us- 
ing the focal reducers CAFOS at the 2.2-m-telescope, MOSCA 
at the 3.5-m-telescope and the prime focus near-infrared cam- 
era OMEGAprime at the 3.5-m-telescope (Bizenberger et al. 
1998). The basic data reduction steps like flatfielding, cosmic
correction and coadding dithered frames are done with
the MIDAS software package in combination with a dedicated
CADIS reduction context based on routines from MPIAPHOT
(by Meisenheimer & Röser).

Depending on their spectrum, objects are detected in different
bands with different signal-to-noise ratios. In particular,
faint emission-line objects can be quite well detected in narrow
bands containing the line. Therefore, the object search is done on
the sumframe of each band with the SExtractor software (Bertin
& Arnouts 1996), and the filter-specific object lists are then
merged into a master list containing all objects exceeding a
minimum S/N ratio in any of the bands. For merging, all objects
which fall into a common error circle of 1" are considered
identical, while the typical seeing is 1.5". The positions of all
detections in the different color bands are then averaged into a
final position.
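
The merging step described above can be illustrated with a minimal sketch. It is not the CADIS or MPIAPHOT code: the function name `merge_detections`, the greedy matching against the running mean position, and the use of flat coordinates in arcseconds are assumptions made only for this example.

```python
import math

def merge_detections(per_band, radius_arcsec=1.0):
    """Merge per-band detection lists into one master list.

    per_band maps a band name to a list of (x, y) positions in arcsec
    on a common flat coordinate grid (an assumption of this sketch).
    Detections closer than radius_arcsec to the running mean position
    of an existing entry are treated as the same object.
    Returns a list of (mean_x, mean_y, number_of_detections).
    """
    master = []  # each entry collects the detections of one object
    for positions in per_band.values():
        for x, y in positions:
            for group in master:
                cx = sum(p[0] for p in group) / len(group)
                cy = sum(p[1] for p in group) / len(group)
                if math.hypot(x - cx, y - cy) <= radius_arcsec:
                    group.append((x, y))
                    break
            else:
                # no existing object within the error circle
                master.append([(x, y)])
    return [
        (sum(p[0] for p in g) / len(g), sum(p[1] for p in g) / len(g), len(g))
        for g in master
    ]

# Hypothetical detections in two bands
catalog = merge_detections({
    "R": [(0.0, 0.0), (10.0, 10.0)],
    "465": [(0.4, 0.0), (30.0, 30.0)],
})
```

In this toy input the two detections near the origin fall within the 1" circle and are averaged into one position, while the other two remain separate objects.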

Although the object morphology is determined by SExtrac- 
tor on each sumframe, we use our own morphological analysis 
based on MPIAPHOT. The final morphology of an object is 
determined on the sumframe where the object shows up with 
the highest S/N ratio. After comparison with the typical PSF of 
stars, objects are sorted into the classes stellar and extended. 

The multi-color classification is based on color indices
which should be measured as accurately as possible. Therefore,
an accurate relative calibration of the different filters is
very important. The absolute calibration is not relevant for the
classification, but matters for flux-limited counts or
luminosity-dependent studies. For statistically correct results of the
classification we need not only accurate flux data but also accurate
individual errors for them (see paper I, Sect. 2). We achieve
this by measuring the object fluxes on each individual image
and deriving the error from the variance of the resulting values.

We get an optimum signal-to-noise ratio by integrating the
photons over an aperture with a Gaussian weight distribution
(Meisenheimer & Röser 1993). In each image the aperture is
located at the same position on the sky and its size and weight
distribution are adapted to the seeing of the frame. Every image
gets a weight aperture that simulates a Gaussian smoothing
to a common seeing before the photon counting, in order to
make sure that always the same fraction of an object's intrinsic
light distribution is probed. The obtained flux is calibrated
by the standard stars to yield the accurate flux value for point
sources inside a virtual aperture of infinite size. For non-point
sources the flux is underestimated to a degree depending on
their morphology, and total fluxes are estimated from morphological
parameters. See Meisenheimer et al. 2000 for a detailed description
of all aspects of the CADIS data reduction.

For photon counting, the coordinates from the master list 
are transformed into the coordinate system of each individual 
frame taking not only translation and rotation into account, but 
also scaling and distortion differences resulting from the use of 
different imaging instruments. The measured counts are trans- 
lated into physical fluxes outside the terrestrial atmosphere by 
using a set of tertiary spectrophotometric standard stars we es- 
tablished in the CADIS fields. For these standard stars we know 
the physical fluxes in every CADIS filter, and therefore the cal- 
ibration of each image is independent of the photometric con- 
ditions during the exposure. 

3.2. Color indices, flux units and errors 

Fig. 1. The quantum efficiency of all CADIS filters after taking the
entire observing system into account. The set contains four
broad-band filters, CADIS-B, CADIS-R, J and K' (not shown), and 13
medium-band filters. All filters are used for classification
and redshift estimation.

Since we present CADIS results in terms of a magnitude scale
not commonly used by optical astronomers, we start with a short
introduction to the basics of magnitude and color systems here.
A color index describes an object's brightness ratio in any two
chosen filters and is usually given in units of magnitudes. There
are several definitions for the zeropoints of the magnitude scale,
an astronomical definition handed down from history and an
increasing number of modern physical definitions (e.g. in cgs
units). The zeropoint of the astronomical magnitude scale was
for a long time the A0V star Vega = α Lyr, so an object's magnitude
was defined for any filter as

$$ m = -2.5 \log (F_{\rm obj} / F_{\rm Vega}) , \tag{1} $$



and astronomical color indices between two filters A and B 
were given by 



$$ m_A - m_B = -2.5 \log \frac{F_{{\rm obj},A}}{F_{{\rm obj},B}} + 2.5 \log \frac{F_{{\rm Vega},A}}{F_{{\rm Vega},B}} . \tag{2} $$



The issue of brightness calibration became a little more 
complex by improvements in detector systems. Different filter 
systems were used with different detector types and introduced 
certain inconsistencies in the calibration. Nowadays, Vega is
not defining the zeropoint anymore but remains rather close
to it with small filter-dependent deviations of |Δm| ≲ 0.05 mag.
However, since the Vega spectrum has a highly non-trivial 
shape on a physical flux scale, values of astronomical color 
indices do not properly convey the related physical flux ra- 
tios. The first physical magnitude scale was introduced by Oke 
(1964, 1965, 1974) and called "AB magnitude". It is defined 
as: 



$$ {\rm ABmag} = -2.5 \log F_\nu - 48.60 \quad \mbox{with } F_\nu \mbox{ in } {\rm erg\,cm^{-2}\,s^{-1}\,Hz^{-1}} , \tag{3} $$



so objects with $F_\nu$ = const have all their AB colors equal
to zero. The denomination "AB" originates from a program
variable in J. B. Oke's reduction software for the calibration
of his standard stars (Oke 1998). In the recent past, the "ST
magnitude" was defined as (Walsh 1995):

$$ {\rm STmag} = -2.5 \log F_\lambda - 21.10 \quad \mbox{with } F_\lambda \mbox{ in } {\rm erg\,cm^{-2}\,s^{-1}\,\AA^{-1}} , \tag{4} $$



so objects with $F_\lambda$ = const have their ST colors equal
to zero, where "ST" probably means "Space Telescope". In
CADIS, flux values are given in units of photons per (m^2 s nm),
as is done in X-ray astronomy. Therefore, we feel inclined to
introduce a third physical system, the "CD magnitude", defined
as

$$ {\rm CDmag} = -2.5 \log F_{\rm phot} + 20.01 \quad \mbox{with } F_{\rm phot} \mbox{ in } \gamma\,{\rm m^{-2}\,s^{-1}\,nm^{-1}} , \tag{5} $$



so objects with $F_{\rm phot}$ = const have their CD colors equal
to zero. The name "CD" relates to CADIS and to the diplomatic
choice of this magnitude, which is just the average of AB
and ST magnitude for any given object. The three flux scales
mentioned are related by

$$ \nu F_\nu = h c F_{\rm phot} = \lambda F_\lambda . \tag{6} $$



All these magnitude systems are in fact designed to have a
common zeropoint at a wavelength of $\lambda_0$ = 548 nm, so every
object observed through a quite narrow filter centered there will
have

$$ {\rm ABmag} = {\rm CDmag} = {\rm STmag} = \mbox{astronomical mag} . \tag{7} $$
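
The relations between the three physical magnitude systems can be checked numerically. The following is a minimal sketch, not part of the CADIS software: the function names and the choice of input units (flux density per Ångström at a given wavelength in nm) are assumptions of this example, while the zeropoints are those of Eqs. (3) to (5).

```python
import math

H_ERG_S = 6.62607015e-27   # Planck constant in erg s
C_CM_S = 2.99792458e10     # speed of light in cm / s

def ab_mag(f_nu):
    """AB magnitude, f_nu in erg cm^-2 s^-1 Hz^-1 (Eq. 3)."""
    return -2.5 * math.log10(f_nu) - 48.60

def st_mag(f_lambda):
    """ST magnitude, f_lambda in erg cm^-2 s^-1 Angstrom^-1 (Eq. 4)."""
    return -2.5 * math.log10(f_lambda) - 21.10

def cd_mag(f_phot):
    """CD magnitude, f_phot in photons m^-2 s^-1 nm^-1 (Eq. 5)."""
    return -2.5 * math.log10(f_phot) + 20.01

def convert(f_lambda, wavelength_nm):
    """Convert f_lambda at one wavelength into (f_nu, f_phot), cf. Eq. (6)."""
    lam_angstrom = wavelength_nm * 10.0
    lam_cm = wavelength_nm * 1e-7
    f_nu = f_lambda * lam_angstrom ** 2 / (C_CM_S * 1e8)  # c in Angstrom / s
    photon_energy = H_ERG_S * C_CM_S / lam_cm             # erg per photon
    # cm^-2 -> m^-2 gives 1e4, Angstrom^-1 -> nm^-1 gives 10
    f_phot = f_lambda * 1e4 * 10.0 / photon_energy
    return f_nu, f_phot

def all_mags(f_lambda, wavelength_nm):
    f_nu, f_phot = convert(f_lambda, wavelength_nm)
    return ab_mag(f_nu), st_mag(f_lambda), cd_mag(f_phot)

ab, st, cd = all_mags(1e-17, 548.0)
```

At 548 nm the three values agree to within about 0.01 mag, and the difference between the CD magnitude and the mean of AB and ST magnitude does not depend on wavelength; the small constant residual comes from the rounded zeropoints.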



For most CADIS objects photon fluxes are given in rather
practical units, since an object of V ≈ 20 has just a flux
of 1 $\gamma\,{\rm m^{-2}\,s^{-1}\,nm^{-1}}$ at $\lambda_0$, while Vega has
almost exactly $10^8\,\gamma\,{\rm m^{-2}\,s^{-1}\,nm^{-1}}$. As an input to
our multi-color classification we define color indices on the basis
of CD magnitudes by

$$ m_1 - m_2 = -2.5 \log \frac{F_{{\rm phot},1}}{F_{{\rm phot},2}} \tag{8} $$



and approximate the corresponding errors for well-detected
objects (> 5σ) by

$$ \sigma_{m_1 - m_2} = \sqrt{ (\sigma_{F_{{\rm phot},1}} / F_{{\rm phot},1})^2 + (\sigma_{F_{{\rm phot},2}} / F_{{\rm phot},2})^2 } . \tag{9} $$
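
As a small worked example of Eqs. (8) and (9), the sketch below computes a color index and its approximate error from two photon fluxes. The function name and the input values are hypothetical; the comparison with first-order error propagation, which carries the extra factor 2.5/ln 10 ≈ 1.086, is added here only to show how close the approximation of Eq. (9) is.

```python
import math

def color_index(f1, sigma1, f2, sigma2):
    """Color index and its error from two photon fluxes.

    Returns (color, err_eq9, err_linear):
      color      : m_1 - m_2 as in Eq. (8)
      err_eq9    : quadratic sum of relative flux errors, Eq. (9)
      err_linear : first-order propagation, larger by 2.5 / ln(10)
    """
    color = -2.5 * math.log10(f1 / f2)
    relative = math.hypot(sigma1 / f1, sigma2 / f2)
    err_eq9 = relative
    err_linear = 2.5 / math.log(10.0) * relative
    return color, err_eq9, err_linear

# Hypothetical fluxes in photons m^-2 s^-1 nm^-1: a 20-sigma and a 10-sigma detection
color, err_eq9, err_linear = color_index(100.0, 5.0, 10.0, 1.0)
```

For these inputs the object is 2.5 mag brighter in the first filter, and the two error estimates differ by less than 9%.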

In paper I, we had discussed the relevance of a common
base filter for the various color indices, which is supposed to
have relatively small flux errors in order to keep the color errors
as low as possible. Since CADIS observes a small number
of broad bands and a larger number of medium-band filters
(see Fig. 1), we decided to form color indices from broad bands
neighboring on the wavelength axis, i.e. B-R, R-I and I-J or I-K
depending on the availability of the data. Each of the medium
bands we combine with the nearest broad band in terms of
wavelength, which then serves as a base filter for the medium-band
color indices, e.g. B-486 or 605-R, where letters denote
broad bands and numbers represent the central wavelength of
medium-band filters measured in nanometers.

Fig. 2. This diagram shows a few selected color-color plots containing
all point sources with R < 22 in the CADIS 16h-field, of which
roughly 75% are stars, 20% compact galaxies and 5% quasars. By
plotting the CADIS objects in black and the Pickles (1998) library
in grey, a number of features can be seen in the diagrams: e.g., the
R-I color index in the upper right panel illustrates a calibration
offset of 0.05 mag. The B-465 and 522-R color indices in the lower
two panels already contain stellar population information: the CADIS
objects (mostly halo stars, population II) fall preferentially onto
the upper one of two arms in the color distribution of G and K stars
(having 0 ≲ B-R ≲ 1), while most library stars (residing in the
Galactic disk, population I) are in the lower arm.

Our I band is in fact a medium-band filter centered at the 
location of an M star pseudocontinuum feature around 815 nm. 
It is among the first filters observed on every field and therefore 
the most suitable choice for another base filter between R and J. 
For the classification this means, that we use a few deep broad 
bands to fit the global shape of the SED, and then use a few 
groups of medium bands around each deep broad-band to fit the 
smaller-scale shape locally. This scheme is superior to a single
fit with all filters, since it can tolerate changes in the global 



SED due to reddening or cosmic variance, while being more 
sensitive to distinctive spectral features on a smaller scale. 

Before entering a CADIS object table into the classification 
procedure, we check the calibration of the color indices by a 
comparison with the star colors from the Pickles (1998) library. 
This library is also used for the multi-color classification itself, 
so its colors are required to be consistent with the star colors 
in the CADIS data for a successful classification. Since we cal- 
ibrate our object fluxes by Pickles spectra in the first place, 
consistent colors are expected unless a fault in the reduction 
or calibration process altered the flux results. The calibration 
check involves a visual inspection of the color-color diagrams 
of all point sources with R < 22 in a CADIS field (see Fig. 2),
among which 75% should be stars and the scatter due to flux 
errors should be small. Although some faint compact galaxies 
and quasars are contained among the point sources, the check 
is still possible since they are only a minority of objects likely 
to show very different colors. We then ignore these outliers and 
focus on the bulk of objects, which are almost entirely normal 
stars.

Fig. 3. This diagram illustrates the flux errors in two example filters. It shows all objects detected in the 16h-field with an error of
less than 10% in the broad-band filter CADIS-R and the medium-band filter centered at 465 nm. The sharp lower error
limit of parabolic shape is given by pure photon shot noise. Most objects have larger errors due to sub-optimal flatfielding and 
further reduction effects. The R band shows two well-populated arms since a central square part of the field has been imaged on 
a smaller CCD before and therefore reaches deeper when all existing images are combined. The thinly populated brighter arm in 
the 465-image results from objects around the field edges, which are only seen in few exposures due to dithering. The few objects 
in the upper left area of the diagram have unusually large errors due to uncorrected detector defects or other artifacts affecting 
one of the involved images. 



Any calibration mistake showing up as a zeropoint shift in
the color-color diagrams leads to an inspection of the reduction
process until the underlying fault is found. We aim to detect shifts
of ≳ 0.03 mag, which is our goal for the accuracy of the relative
calibration. Larger offsets in color indices would limit the
performance of the multi-color classification, which is otherwise
limited by color differences between real objects and the library
as well as by the design of the libraries, which itself follows the
0.03 mag goal. As any calibration of absolute fluxes is probably
uncertain on the order of 10% and does not matter for the
multi-color classification anyway, there is no independent check for
this besides a check for consistency among the tertiary CADIS
standards themselves.

This calibration check highlights already the difficulty of 
disentangling valid astrophysical information from data processing
problems. The two diagrams containing the color indices
B-465 and 522-R show an apparent disagreement between
measurements and library colors for G and K stars which
have 0 ≲ B-R ≲ 1, while the agreement is good among F
and M stars. In these two medium-band filters different stellar
populations form two separate arms: The data are dominated
by the halo population which lies above the disk population
that makes up most of the library.

3.3. Photometry and morphology data 

The final CADIS catalog is defined to contain all objects within
a circle of 400" radius around the field center, provided they
have been detected at least at a 6σ level in any coadded sumframe
from a CADIS filter or Fabry-Perot band. These limits
have been chosen to avoid spurious effects from the field
edges and unnecessary overhead from dealing with noise objects.
This round field will contain photometry from almost all
filters observed with CAFOS, but those observed with MOSCA
will only be available for objects in the central 10' x 10'. Table 2
summarizes the filters, exposure times and limiting magnitudes
for the three fields analysed here, illustrating on which data
the multi-color classification is currently based. These fields
have their imaging program completed by 80% on average, and
Meisenheimer et al. (2000) report on the final exposure depths
to be reached.

Object photometry is performed on each single frame, so
that the scatter among the individual flux measurements gives
an estimate for the flux error. Since an object is placed at different
locations in the single frames, the error obtained this
way includes not only photon shot noise but also effects introduced
by an imperfect flatfield or not properly corrected night-sky
fringes. If the scatter is implausibly low due to a chance
coincidence of the single values, we still assume a minimum
error corresponding to photon noise. Fig. 3 shows the distribution
of flux errors and magnitudes in the broad R band and the
medium-band filter at 465 nm for all objects detected with less
than 10% flux error in the 16h-field.
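
The error estimate described above can be sketched as follows. This is an illustration only: the unweighted mean, the use of the standard error of the mean as the scatter-based error, and the photon-noise error being supplied by the caller are assumptions of this example and may differ from the actual MPIAPHOT procedure.

```python
import math
import statistics

def combined_flux(frame_fluxes, photon_noise_error):
    """Combine single-frame fluxes of one object in one filter.

    frame_fluxes       : fluxes measured on the individual frames (>= 2 values)
    photon_noise_error : expected photon-noise error of the combined flux
    Returns (mean_flux, flux_error). The error is derived from the scatter
    among the frames, but never falls below the photon-noise floor.
    """
    n = len(frame_fluxes)
    mean_flux = statistics.fmean(frame_fluxes)
    scatter_error = statistics.stdev(frame_fluxes) / math.sqrt(n)
    return mean_flux, max(scatter_error, photon_noise_error)

# Hypothetical measurements: the scatter dominates in the first case,
# the photon-noise floor applies in the second.
noisy = combined_flux([10.0, 12.0, 11.0, 9.0], photon_noise_error=0.1)
lucky = combined_flux([10.0, 10.0, 10.0, 10.0], photon_noise_error=0.3)
```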

The two main sources of problems for proper color deter- 
mination are object variability and object blending. The multi- 
color database of CADIS is collected over several years, which 
means that different filters are exposed at different epochs. 
We aim to identify variable objects by repeating the broad R- 
band images several times during the survey. This way we have 
found variability in a fraction of our spectroscopic quasar sam- 
ple, some variable stars and even a high-redshift supernova 
candidate with no detectable host galaxy. Object blending is
a problem we cannot resolve, and for close chance projections
we do not even have a possibility to flag objects on the basis
of imaging data. In one case we only realized the true nature of
a blended object with VLT spectroscopy in 0.5" seeing. Object
blending is furthermore a problem for morphological measurements.

Fig. 4 shows the morphological properties of all objects in
the CADIS 16h-field as determined by MPIAPHOT. Although
MPIAPHOT determines the semi-width of major and minor
axis as well as an axis angle, all information available is condensed
into a stellarity parameter based on the average PSF
on a frame. This stellarity parameter ranges from 0 to 1000
with values above 800 being considered as point source images.
Since the stellarity value of an object is always determined in
the filter where it is detected with the best signal-to-noise ratio,
some objects in Fig. 4 are very faint in the R-band but
still clearly considered point sources, because they are much
brighter in the far-red or even K-band and their morphology is
easily determined there. The morphological star-galaxy separation
starts failing at R ≳ 21 where already many galaxies
appear compact (stellarity > 800) in our typical seeing of 1.5".
So, the multi-color classification provides a strong improvement
already in this brightness regime, i.e. for most survey objects.

Fig. 4. Stellarity over R band magnitude for all objects in the
16h-field detected at least at a 6σ level in any CADIS filter or
Fabry-Perot band.

4. Multi-color classification 

Objects are classified independently by their photometric and 
morphological information, since the morphologic information 
transmitted through average CADIS seeing (~ 1.5") is not very
useful in contrast to the well-resolved spectrophotometric in- 
formation provided by a dozen filters. A cross check of results 
turns out to be useful in cases where the CADIS color space 
does not provide enough discriminative power to distinguish 
overlapping classes. 

Sect. 4.1 outlines our classification scheme, which has
been published in full detail in paper I. In Sect. 4.2 we discuss
results from Monte-Carlo simulations of CADIS multi-color
data at various magnitude levels, where we want to see what
classification performance we can expect from the CADIS filter
set. Sect. 4.3 presents statistical properties of the classified
object catalogs for the three fields reported here. Finally,
Sect. 4.4 discusses the fraction of unclassified and strange objects,
whereas a detailed discussion of a few peculiar objects
challenging the classification can be found in the appendix.



Table 2. Filters and 10σ magnitude limits for the three CADIS
fields presented here. Filters not exposed or reduced yet are left
blank. Exposure times are given for the 16h-field as an example,
and represent true exposures for the 2.2-m-telescope, or
equivalent 2.2-m-exposures in italics if the observations were
done at the 3.5-m-telescope. The given magnitudes are astronomical
(Vega-normalised) 10σ limits estimated from the
magnitude distribution of those objects with flux errors measured
to be roughly 10%.

λcen/fwhm (nm)   t_exp (s)   m_1h   m_9h   m_16h
461/113 (B)         6200     24.6   24.8   24.7
649/172 (R)         5300     24.1   24.5   24.1
815/25 (I)         30700     21.7   22.9   22.9
1200/000 (J)                        21.5
2120/340 (K')      22500     19.9   20.0   19.5
396/10                       24.0   23.4
465/10             38000     23.6   23.9   24.1
489/20                              23.5   23.7
522/15                       24.5   24.3   23.4
535/14                              24.1   24.0
611/16             11000     24.1   23.4   23.4
628/16             23000     23.5   23.6   23.6
683/18                              23.5
702/19                       23.7   23.7
752/28             15600     23.1   22.5   22.9
855/13                              22.3   22.4
909/31             31600     21.2   21.8   22.3



4.1. Classification scheme 

Our multi-color classification scheme essentially compares the 
observed colors of each object with a color library of known 
objects. This library is assembled from observed spectra by 
synthetic photometry performed on the CADIS filter set. As 
an input we used the stellar library from Pickles (1998), the 
galaxy template spectra from Kinney et al. (1996) and the QSO 
template by Francis et al. (1991). From the latter, we generated
regular grids of QSO templates ranging in redshift within
0 < z < 6 and having various continuum slopes and emission-line
equivalent widths. Also, a grid of galaxy templates has been
generated for 0 < z < 2; it contains various spectral types from
old populations to starbursts.

Objects are classified by locating them in color space and 
comparing the probability for each class to generate the given 
measurement. Given the photometric error ellipsoid in the n- 
dimensional color space, each library object can be assigned 
a probability to cause an observation of the measured colors. 
For a whole class, this probability is assumed to be the average 
value of the individual class members, resulting in relative
likelihoods for each object to belong to the various classes. We
emphasize that quasars are selected by a positive criterion
(quasar-like objects) and not by an exclusive rule (unusual
objects). We also note that we do not use any absolute magnitude
information or cosmological knowledge about the typical abundance
of different object types. This could be
Table 3. Classification matrix for objects of R = 23 and
R = 24 as derived from Monte-Carlo simulations. An input
vector containing a true number distribution of objects among
the three object classes would be mapped by this matrix onto
a classified distribution among four classes. Numbers below
0.005 are left blank. Due to rounding, numbers in one column
do not always add up to 1.00. At R = 22 the three main diagonal
elements are essentially all 1.00.

                 true class, R = 23         true class, R = 24
classified as    star   galaxy   quasar     star   galaxy   quasar

star             0.93            0.01       0.48   0.02     0.02
galaxy                  0.95     0.01       0.04   0.70     0.02
quasar           0.01            0.91       0.02   0.02     0.60
unclassified     0.06   0.04     0.07       0.47   0.26     0.36



Table 4. Mean error of multi-color redshifts as obtained from 
the Monte-Carlo simulations for the filter set of the 16h-field. 



object type          R = 22   R = 23   R = 24

quiescent galaxy      0.01     0.05     0.13
starburst galaxy      0.04     0.12     0.25
quasar @ z < 2.2      0.27     0.51     0.78
quasar @ z > 2.2      0.10     0.17     0.22



added to the scheme in order to reduce the overall failure rate of
the classification, but it would also reduce the likelihood of
identifying members of the rare quasar class.

This statistical approach contains two implicit concepts of
unclassifiability. For each object we normalized the relative
likelihoods such that the three classes always add up to 100%.
But if the photometric error ellipsoid overlaps with several
classes, the likelihood is distributed among them and the true
class may not be identifiable. This occurs when the available
colors do not discriminate between the classes or when the
measurement error is too large. Unusual objects which are far away
from any library in color space we call strange, if their
measurement is statistically inconsistent with being caused by any
library object at more than a 3-σ level.
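
The likelihood averaging and normalization described above can be
sketched as follows. This is only an illustration under stated
assumptions, not the implementation of paper I: the function name
classify, the dictionary layout of the library, and the treatment
of the error ellipsoid as axis-aligned (independent Gaussian errors
per color index) are all hypothetical choices made for this example.

```python
import numpy as np

def classify(colors, errors, library, threshold=0.75):
    """Return (probabilities, label, min_chi2) for one object.

    colors, errors : measured color indices and their 1-sigma errors
    library        : dict, class name -> (n_members, n_colors) template colors
    threshold      : minimum probability needed to accept a class
    """
    colors = np.asarray(colors, dtype=float)
    errors = np.asarray(errors, dtype=float)
    likelihood = {}
    min_chi2 = np.inf
    for name, templates in library.items():
        templates = np.asarray(templates, dtype=float)
        # Distance of the measurement to each library member in units of the
        # photometric errors (axis-aligned error ellipsoid, an assumption).
        chi2 = np.sum(((templates - colors) / errors) ** 2, axis=1)
        # Class likelihood: average over the individual class members.
        likelihood[name] = float(np.mean(np.exp(-0.5 * chi2)))
        min_chi2 = min(min_chi2, float(chi2.min()))
    total = sum(likelihood.values())
    if total == 0.0:
        # Numerically inconsistent with every template: no class information.
        probs = {name: 0.0 for name in likelihood}
    else:
        # Normalize so that the classes add up to 100%.
        probs = {name: value / total for name, value in likelihood.items()}
    best = max(probs, key=probs.get)
    label = best if probs[best] >= threshold else "unclassified"
    return probs, label, min_chi2
```

The returned min_chi2 is the distance to the nearest library member;
a cut on it (its value is left to the caller here) would flag strange
objects, while a best-class probability below the threshold marks
objects whose likelihood is shared among overlapping classes.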

Since the galaxy and quasar libraries resemble regular grids
in redshift and spectral type, these parameters can also be
estimated from the observation. For this purpose, we use an
advanced Minimum-Error-Variance estimator (MEV+) to determine a
redshift estimate as well as an error estimate (see paper I for
all relevant details on the multi-color classification and
redshift estimation procedure).
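
For orientation, a plain minimum-error-variance estimate can be
sketched as the likelihood-weighted mean redshift of the library
grid, with the weighted scatter serving as the error estimate. The
refinements that make up MEV+ are described in paper I and are not
reproduced here; the function name mev_redshift and its inputs are
assumptions made for this example.

```python
import numpy as np

def mev_redshift(z_grid, likelihoods):
    """Likelihood-weighted mean redshift and its rms scatter.

    z_grid      : redshifts of the library templates
    likelihoods : likelihood of each template given the measured colors
    """
    z_grid = np.asarray(z_grid, dtype=float)
    weights = np.asarray(likelihoods, dtype=float)
    weights = weights / weights.sum()
    # The weighted mean minimizes the expected squared error.
    z_est = float(np.sum(weights * z_grid))
    # Weighted rms scatter around the estimate, used as the error estimate.
    sigma_z = float(np.sqrt(np.sum(weights * (z_grid - z_est) ** 2)))
    return z_est, sigma_z
```

A likelihood distribution with two well-separated peaks yields a
large sigma_z in this scheme, which is one way such ambiguous cases
can be recognized.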

4.2. Monte-Carlo simulations 

We carried out a range of Monte-Carlo simulations to check 
what classification performance we can expect from our 
method in combination with the CADIS data. We used simu- 
lated multi-color observations of stars, galaxies and quasars as 
an input into our algorithm and compared input and resulting 
Fig. 6. Monte-Carlo simulations for the photometric redshifts
of galaxies and quasars with R = 22 ... 24 according to the
MEV+ estimator. Among galaxies black dots denote quiescent
systems and grey dots are starburst galaxies. This diagram
shows the redshift estimates for all galaxies, including the
tentative ones, but only for those quasars passing the
classification limit of 75%.



classification. The simulated input list was prepared by using 
objects from our color libraries. 

We assumed a certain R-band magnitude and calculated
individual filter fluxes and corresponding errors for each
object. Then we scattered the object fluxes according to a normal
distribution of the flux errors. Finally, we recalculated the
resulting color indices and index errors and used this object list
as an input to the multi-color classification. From the stars we
used all 131 library members as test objects. From the galaxies we
took only every third member of the present library, giving us
6700 objects. From the quasar library we used every seventh
object, resulting in 6450 quasars per test run. The simulations
for the quasars are appropriate for surveys measuring the colors
within a short period of time, since they incorporate no
variability, which skews the color indices in a database collected
over a long time as is the case with CADIS.
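
The scattering step can be sketched as below. The sketch assumes
that color indices are formed between neighbouring filters and that
the 3% systematic uncertainty acts as a floor on the index errors;
both are assumptions of this example, as is the function name
simulate_colors.

```python
import numpy as np

def simulate_colors(fluxes, flux_errors, rng, min_error=0.03):
    """Scatter filter fluxes once and return color indices with errors.

    fluxes, flux_errors : noise-free fluxes and 1-sigma errors per filter
    rng                 : numpy random Generator
    min_error           : floor on the color index error in magnitudes
    """
    fluxes = np.asarray(fluxes, dtype=float)
    flux_errors = np.asarray(flux_errors, dtype=float)
    # Draw one noisy realization of the fluxes.
    scattered = rng.normal(fluxes, flux_errors)
    # Color index between neighbouring filters. Non-positive scattered
    # fluxes give NaN here; a real pipeline must handle them separately.
    colors = -2.5 * np.log10(scattered[:-1] / scattered[1:])
    relative = flux_errors / scattered
    color_errors = (2.5 / np.log(10.0)) * np.sqrt(
        relative[:-1] ** 2 + relative[1:] ** 2
    )
    return colors, np.maximum(color_errors, min_error)
```

Calling the function repeatedly with the same library object and a
generator such as np.random.default_rng() produces independent
noisy realizations of that object.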

These simulations show us how well the classification can
possibly work, assuming that real objects precisely mimic
the library objects. Every real situation will contain differences
between SED models and SED reality, sometimes called "cosmic
variance", which will worsen the performance of every real
application. Nevertheless, the simulation highlights the principal
shortcomings of the method and of the chosen filter set in
particular.

Fig. 5. Monte-Carlo simulation for the CADIS classification of stars, galaxies and quasars with R = 22 ... 24. The probability
for a simulated object to be assigned to its original class is plotted over the color B-R for stars and over the redshift for
galaxies and quasars. The B-R color is in CD magnitudes, offset by -0.67 mag relative to Vega-calibrated colors. In the case of the
galaxies black dots denote quiescent galaxies (SED<60) and grey dots are starburst systems (SED>60). For bright objects the
performance is limited by a systematic uncertainty of 3% assumed as a minimum error for the color indices.

We ran these tests for stars, galaxies and quasars with
magnitudes of R = 22, 23 and 24, in order to see how the
classification performance degrades from optimum to useless with
decreasing object flux. We expect the classification to show
its best possible performance already at R = 22, where the
calibration uncertainty of 3% and cosmic variance dominate
over the photon noise. At R = 24 we do not expect
the classification to be reliable anymore, but we would like to
see how well the redshift estimation still works. Afterwards,
we would like to compare these simulations with the real
performance derived from spectroscopic identifications, most of
which were obtained on the 16h-field. For a fair comparison,
the simulation uses just the filter set presently available on
this field, which lacks the four filters 396/10, 683/18, 702/19
and J needed for full performance.

As a result we obtain a classification and potentially a
redshift estimation for every simulated object. The sample results
can be condensed into a classification matrix showing what
fraction of an input sample from a given class is classified into
the various output classes, and especially what objects are seen
as unclassifiable. Table 3 shows the resulting matrix, which
for bright objects (R = 22) basically resembles an identity
map corresponding to a classification without mistakes. The
main diagonal elements contain the completeness of the three
classes, while the other elements count the misclassifications.
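
How such a matrix can be accumulated from a list of simulated
objects is sketched below. The label strings and the function name
classification_matrix are assumptions made for this example; the
layout (rows are output classes, columns are true classes) follows
the description of Table 3.

```python
import numpy as np

TRUE_CLASSES = ("star", "galaxy", "quasar")
OUTPUT_CLASSES = ("star", "galaxy", "quasar", "unclassified")

def classification_matrix(true_labels, assigned_labels):
    """Fraction of each true class that ends up in each output class.

    Rows follow OUTPUT_CLASSES, columns follow TRUE_CLASSES, so every
    populated column sums to 1.
    """
    counts = np.zeros((len(OUTPUT_CLASSES), len(TRUE_CLASSES)))
    for true, assigned in zip(true_labels, assigned_labels):
        row = OUTPUT_CLASSES.index(assigned)
        col = TRUE_CLASSES.index(true)
        counts[row, col] += 1
    column_totals = counts.sum(axis=0, keepdims=True)
    # Leave columns without any input objects at zero.
    return np.divide(counts, column_totals,
                     out=np.zeros_like(counts), where=column_totals > 0)
```

Multiplying the matrix with a vector of true object numbers per
class, e.g. matrix @ np.array([n_star, n_galaxy, n_quasar]), gives
the expected numbers in the four output classes.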

At R = 23, we expect to lose roughly 10% of the stars
and quasars to unclassifiability. At R = 24, the survey has
become almost useless, since about half of the stars and quasars
are not classified correctly anymore. Most incorrectly classified
objects are unclassifiable and only a minority of them are
scattered into another class.

In particular, the quasar class does not seem to be strongly
contaminated by false candidates; the contamination should matter
only at R > 22. According to the simulations roughly 0.5% of the
galaxies at R = 23 and 2% of the galaxies at R = 24 are scattered
into the quasar class. If, e.g., the ratio of quasars to galaxies
was of the order of 1:100, the contaminants should amount to about
a third of the candidates at R = 23 and should be the dominant
fraction at R = 24.
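
The arithmetic behind these fractions can be checked with a short
calculation. It takes one quasar per 100 galaxies as the
illustrative ratio, the quasar completeness from Table 3 (0.91 at
R = 23, 0.60 at R = 24) and the galaxy leakage rates quoted in the
text; the function name is an assumption of this example.

```python
def quasar_contamination(n_quasars, n_galaxies, completeness, leak_rate):
    """Fraction of quasar candidates that are in fact galaxies."""
    true_candidates = n_quasars * completeness
    false_candidates = n_galaxies * leak_rate
    return false_candidates / (true_candidates + false_candidates)

# One quasar per 100 galaxies (illustrative ratio).
fraction_r23 = quasar_contamination(1, 100, completeness=0.91, leak_rate=0.005)
fraction_r24 = quasar_contamination(1, 100, completeness=0.60, leak_rate=0.02)
print(f"R = 23: {fraction_r23:.2f}, R = 24: {fraction_r24:.2f}")
```

With these inputs the contaminating fraction is about 0.35 at
R = 23 and about 0.77 at R = 24.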

Faint objects suffering from unclassifiability usually have
rather featureless continua. Fig. 5 shows a few plots for
different classes and R-band magnitudes, illustrating the
probability for recovering the correct class of a simulated object
as a function of a characteristic parameter, which is color for stars
Table 5. Classification statistics for the 16h-field.

            stellar                      extended
R-mag       star   gal   qso   uncl.     star   gal   qso   uncl.

 ..20         48     2     1       -        1    23     1
 20..21       45     8     3       -        3    81     -       2
 21..22       66    26     4       3       19   177     4       6
 22..23       87    89    11      10       71   347     6      38
 23..24       52   160     7     108       88   493    47     362


and redshift for extragalactic objects. The faint unclassified
objects include most stars except for M stars, which have rather
characteristic spectra and are bright in the far-red wavelength
range, where data from many medium-band filters are available.
Starburst galaxies at higher redshift (z > 1) and quasars at
low redshift (z < 2.2) both show rather blue spectra and are
not differentiated at the faintest levels. Also, galaxies around
zero redshift are not easily recovered, since the CADIS filter
set does not contain a U-band or medium-band filters bluewards
of 465 nm.

The quality of the redshift estimation for galaxies and
quasars is shown in Fig. 6, where the population on the diagonal
consists of correctly estimated objects and the deviations are
mistakes. Most objects scatter around the diagonal, and a few
deviate by a large amount in a non-Gaussian distribution. These
are catastrophic mistakes, where the classification considers more
than one redshift value as likely and decides for the wrong
one. The variance of the true estimation error obtained from
the simulations is summarized in Table 4. Generally, quiescent
galaxies are estimated more accurately than starburst galaxies
due to stronger continuum features, mainly the red
continuum shape on the blue side of the 4000 Å break. Therefore,
the limiting magnitude for a given redshift accuracy
is also about one magnitude deeper for the quiescent galaxies. For
a similar reason quasars at z > 2.2 work better than those at
lower redshift, since they display a distinctive continuum step
across the Lyman-α line.

4.3. Classification statistics 

We applied the multi-color classification to the object
catalogs of the fields 1h, 9h and 16h. Although the imaging data
collected so far differ in terms of depth and available filters,
we expect the classification performance to be roughly similar
and use the classification results for various scientific
applications. Statistical properties of the classified object
samples are shown in the following figures and tables:

Fig. 7 shows magnitude histograms of all objects contained
in each of the three fields and the fraction classified
successfully by color, which means that these objects concentrate
at least 75% of their class membership probability onto a single
class. At R < 22, the multi-color classification is more than 97%
complete. Towards fainter magnitudes, first the classification
becomes incomplete and then, at even fainter levels, the object
counts as well. Tentatively, we consider unclassified objects to
be galaxies, simply because they are expected to be the dominant
population at faint levels in extragalactic surveys. Fig. 7 also
shows histograms of all galaxies in each field and the fraction
that received an MEV+ redshift estimate. The subsample of galaxies
with MEV+ estimates is at least 97% complete at R < 23.

Fig. 8 shows galaxy redshift histograms for each field,
where the visible features mostly reflect cosmic large-scale
structure. Of course, the true distribution is smoothed and
potentially distorted slightly by the estimation errors. Also,
objects at z > 2 are excluded by definition of the galaxy library.

Table 5 lists the distribution of multi-color classes over
magnitude and morphology for the 16h-field, where the
contribution of compact galaxies can be seen as well as certain
numbers of seemingly extended faint stars. There are two reasons
for these apparent contradictions: First, there truly is a
population of galaxies not resolved in our seeing and also a
small fraction of stars that appear extended because of chance
projections with fainter extended objects or because they are
double stars. Object blending is of course also a problem for
color determination and classification. Second, the morphology
information also degrades towards fainter magnitudes, as
does the color classification, due to PSF variations and photon
noise. As demonstrated in Fig.Q, the morphology information 
has become useless for not clearly extended sources at R ~ 24. 

Fig. 9 illustrates the location of different object classes in
color-color diagrams, always showing B-R vs. R-I. Obviously,
the main classes (stars, galaxies and quasars) share common
regions, which they do in any two-color diagram since there is no
possible choice of three filters that separates the three classes
unambiguously. Since our classification takes all multi-color
information into account simultaneously, we can still distinguish
the classes.

The bottom row of panels in Fig. 9 depicts unclassified
objects on the left, most of which are supposedly faint blue
galaxies overlapping in color space with low-redshift quasars
and bluer stars. The center panel contains galaxies without an
MEV+ redshift estimate, half of which are also unclassified by
the multi-color algorithm. These objects are at the faint end of
the magnitude range used and lie in a region of color space
where blue galaxies of quite different redshift clump so closely
together that the typical photon noise of faint objects allows no
clear redshift estimation any more.

Finally, the right panel shows the strange objects, defined
to be inconsistent with any library member at a level of at least
3σ. They amount to ~ 1% of the entire object catalog with
R < 23.5 in the 16h-field (see Sect. 4.4 for details). The level
of inconsistency for a truly strange object is of course
determined by its brightness via the magnitude errors. At faint
levels like R ≈ 24, we would require a very strange object indeed
to still notice it.



4.4. Unclassified and strange objects 

To investigate the issue of unclassifiability further, we took a
closer look at the unclassified objects with R < 22 in the
16h-field. These are 12 objects: six artifacts, one blended
double source with 2" separation and only five intrinsically
unclassified objects. The six artifacts should ideally not appear
in the catalog at all, because they are close to the edge of the
field in one filter, which has probably altered their photometry,
or because they are second detections of a bright source with
different position and photometry. In fact, data artifacts inflate
the catalog sample of unclassified objects to some extent and
explain part of the difference observed between the Monte-Carlo
simulations and the real data.

Fig. 7. The panels in the top row show number histograms for all objects in individual field catalogs (grey line) and for the
subsample of objects classified by concentrating more than 75% probability on a single class (black line). The remaining objects
are tentatively classified as galaxies (on purely statistical grounds). The panels in the bottom row show which part of all galaxies
(grey line, including tentative galaxies) get a successful MEV redshift estimation (black line). The remaining galaxies have an
estimated redshift error of σ_z > 0.25.

Fig. 8. These histograms show the redshift distribution of all galaxies with R < 23.5 and a successful MEV estimate for each
field. The galaxy library does not contain objects at redshifts of z > 2, which is therefore excluded as an estimate by definition.

The five intrinsically unclassified objects amount to less
than 1% of the catalog sample with R < 22. They are all
extended, not strange and show galaxy probabilities ranging
from 40% to 75%, with the remaining probability assigned to
the star class. One object is a galaxy with 15" visible diameter
that is supposedly at extremely low redshift (z < 0.05).
The other four galaxies are estimated to be in the range of
z = 0.2 ... 0.4, which has some overlap with stars as we know
from the faint spectroscopic sample. Three objects lack data for
the filters observed with MOSCA, since they lie in the outer
range of the round CAFOS field not covered by the square
MOSCA field. Obviously, with fewer filters the classification
reaches less deep. In summary, the presence of the observed
unclassified objects can be explained and is consistent
with the expectations from the simulations.

We also investigated the strange objects with R < 24 in
the 16h-field. Among these 24 objects are nine artifacts and 15
intrinsically strange objects, corresponding to ~ 0.6% among
a catalog of 2582 objects with R < 24 in total. The 15 objects
are:

- nine rather obvious M stars with colors deviating from the
not entirely representative library, which does not cover the
whole natural spread in metallicity

- three spectroscopically confirmed quasars, but quasar colors
tend to look strange anyway due to variability, variations in
emission-line strength, or diverse and odd spectral
characteristics commonly observed but not covered by the
library

- two z ~ 1 galaxies combining a very red 4000 Å break
around 800 nm with a rather flat spectrum bluewards

- one later M star with an unusually blue B-R color index of
0.53 in CD magnitudes (= 1.2 on a Vega-normalised scale)

Only the last three objects are not explained by systematic
effects and could well be chance projections of a faint blue
object with a brighter red one. But indeed, a Gaussian
distribution should naturally contain 0.25% objects exceeding the
3-σ level of strangeness, corresponding to six objects in the
R < 24 sample of the 16h-field. The other two fields also do not
contain unusual numbers of strange objects. Therefore, it seems
that hardly any exciting unusual objects are in sight which would
call for a fundamentally new explanation. On the other hand,
the classification quality benefits from the good agreement
between the observations and the library.

Fig. 9. These B-R vs. R-I color diagrams illustrate the location of all 1785 objects with R < 23.5 in the 16h-field separated into
the following groups: The panels in the top row show the class populations after final classification. The panels in the bottom
row show unclassified objects, i.e. tentative galaxies (left panel), galaxies without a successful MEV redshift estimate (center
panel) and strange objects (right panel). Color indices are in units of CD magnitudes, where Vega has B-R = -0.67 and
R-I = -0.53.

5. Spectroscopic check of the classification 

5.1. Spectroscopic observations

The quality of the classification and redshift estimation was
checked by multi-slit spectroscopy in two dedicated observing
runs with MOSCA at the 3.5-m telescope. In July 1997, three
multi-slit masks were observed on the 16h-field with grism
green-500 for 4000 s, 12000 s and 16000 s, respectively,
yielding 61 identifications of stars, galaxies and quasars. In
October 1997, only six more objects received identifications
during observations of two masks under bad observing conditions.
The objects were selected to have mostly R < 22 and represent the
range of classes, redshifts and colors found in the catalog,
except for the quasars, where all candidates with R < 22 were
checked. A few fainter objects were included where they fitted
into the slit arrangement of the masks.

We also took advantage of further CADIS spectroscopy,
undertaken to confirm emission-line galaxies, by placing
additional objects on otherwise empty spaces of the multi-slit
masks during three MOSCA runs in January 1998, January
1999 and April 2000, two runs at the Keck telescope in
director's discretionary time in June 1997 and January 1998, and
one run at the VLT in November 1999. Also, some longslit
spectroscopy was done on bright galaxies as a backup program to
CADIS when the seeing was above 2". Altogether, it was possible
to collect 95 more identifications in the three fields reported,
yielding a total of 162 identified objects.

This subsample of 162 identifications is more or less
representative of the object catalog as a whole, as illustrated in
Fig. 10. In terms of galaxy SEDs the spectroscopic sample is
not entirely representative, since at redshift z > 0.7 it contains
no red galaxies, as opposed to the whole CADIS sample. We
note that red galaxies are expected to receive more accurate
redshift estimates due to a stronger 4000 Å break, and therefore
the whole CADIS sample should perform equally well or better
than the spectroscopic subsample in redshift estimation.

5.2. Quality of the classification 

Table 6 shows the classification matrix derived from the
spectroscopic cross check, separated into 103 bright objects
(17 < R < 22) and 59 faint objects (R > 22, including 11 objects
with R > 24), as well as into 64 stellar and 98 extended sources,
which also gives a cross check between morphological appearance
and spectroscopic class. The table is effectively a 4-D matrix
and allows the following conclusions:
Fig. 10. These diagrams show various properties of the spectroscopic subsample (black dots, from all three fields) among all
objects (grey dots, only from the 16h-field) in the left panel and among the respective galaxies in the center and right panels. The
left panel shows a color-color plot, the center panel plots redshift over R-magnitude and the right panel redshift over galaxy
SED. In the left and right panels, a magnitude limit of R < 23.5 has been used.

Fig. 11. The estimation quality for the redshift in the spectroscopic galaxy sample is shown in these diagrams. The left panel
plots the multi-color redshift vs. spectroscopic redshift, with the highest-redshift galaxies residing at z ≈ 1.2. The center panel
shows the error of the estimate (Δz = z_multi-color - z_spectroscopic) over redshift, and in the right panel Δz is plotted over the
R-band magnitude. Half of the galaxies are estimated within an error margin of ±0.02.



Table 6. Classification matrix for the spectroscopic subsample
of 162 CADIS objects, where top rows are objects with 17 < 
R < 22 and bottom rows are objects with R > 22 (on average
R = 23).

                 stellar objects            extended objects
                 true class                 true class
classified as    star   galaxy   quasar     star   galaxy   quasar

17 < R < 22
star               22        -        -        1        -        -
galaxy              -        4        2        -       58        -
quasar              -        -       15        -        -        1
unclassified        -        -        -        -        -        -

R > 22
star                5        1        -        1        4        -
galaxy              1        5        -        -       25        -
quasar              -        5        2        -        4        -
unclassified        -        2        -        -        3        1



- The bright sample contains only two misclassified objects
among 103 in total, which translates into ~ 98% correct
classifications. The mistakes are Seyfert-1 galaxies (i.e.
quasars) found by chance among the compact galaxies.

- The faint sample contains 25% misclassifications and 10%
unclassified objects, with most of them being galaxies. The
others are one L star with R > 26 and K' = 18.5 and one
Seyfert-1 galaxy with R = 22.9 and z = 1.40.

- The presence of many compact galaxies in the faint sample
was confirmed, emphasizing the superiority of a multi-color
classification over the morphological analysis.

- The faint sample shows quite a few galaxies contaminating
the multi-color sample of quasars, while the bright sample
works very well. It remains to be shown around what magnitude
level in the range R = 22 ... 24 the contamination
by galaxies becomes critical.

- Based on our small sample of six objects, our default
interpretation of unclassified objects as galaxies appears
reasonable.

- Extended objects are almost all galaxies. While in the
bright sample one apparently extended star and one
quasar are correctly classified by their colors, there appears
to be some confusion in the faint sample.

There are two extended-looking objects in the bright sample,
which are correctly classified as a star and a quasar,
respectively. The former is in fact a double star at roughly 1.5"
separation with R = 17.3, and the latter is a low-luminosity
quasar at z = 1.57, where we supposedly resolved a bright
host galaxy. The one extended-looking star in the faint sample
is an M star of R = 25, where we do not expect a successful
morphological analysis. Another extended-looking quasar is in
fact a Seyfert-1 galaxy at z = 1.4 and R = 22.9. The
spectroscopic sample contains eight more extended faint galaxies
wrongly classified as stars or quasars.

Among seven faint compact quasar candidates, we confirmed
only two, and among four faint extended candidates
none, suggesting that we should use a morphological preselection
in the faint sample, where galaxy contamination appears to
take over. But the classification of faint objects has changed in
the past as the multi-color database has grown. Also,
morphological types may have changed with the addition of deeper
images or images of better seeing. Given that the faint sample is
still rather small, we cannot train our classification to combine
color and morphology into a robust procedure at this point.

Altogether, the classification is close to ideal at R < 22, 
but at fainter levels, the abundant galaxies start contaminating 
the star class and the quasar class. In fact, the best performance 
is presently achieved by adding a morphological criterion to 
the classification that assigns automatically the galaxy class to 
spatially resolved faint objects with R > 22.5. In this scheme 
the classification produces 6 errors among 151 spectroscopic 
identifications with R < 24. 

The contaminants to the star class are normal starburst
galaxies at z = 0.25 ... 0.4 crossing the stellar locus around
G-type stars. Contaminants to the quasar class are partly
quiescent galaxies at z ~ 0.3 which cross the quasar locus around
z ≈ 3.5, since the 4000 Å break of the galaxies mimics the
continuum step over the Lyman-α line. A solution to this problem
might be the inclusion of more medium-band filters bluewards
of 500 nm. A second confusion arises from strong starburst
systems around z ≈ 1.2 crossing the quasar locus around
z ≈ 1 ... 2.

In the Monte-Carlo simulations (see Sect. 4.2) for R = 22
the classification appeared as ideal as in the spectroscopic
cross check. Also, the simulations for R = 23 and R = 24
suggest shortcomings of several kinds for the classification of
fainter objects, some of which have been observed in the cross
check. E.g., the known confusion of quasars at z ≈ 3.5 with
other objects is seen in Fig. 5 as the line of dots reaching down
to zero probability.

The abundance of low-redshift galaxies sharing their region 
in color space with quasars causes many more galaxies to show 
up in the quasar class than vice versa. The 16h-field seems to 
contain ~ 500 galaxies in the magnitude bin of 22 < R < 23 
and ~ 1200 galaxies in the next fainter bin of 23 < R < 24. 
According to the simulations, we expect 0.4% of the galaxies 
in the brighter bin and 1.5% of the galaxies in the fainter bin 
to scatter into the quasar candidate list, which amounts to ~ 2 
objects and ~ 20, respectively. 

Still, it seems that the Monte-Carlo simulations underestimate
the fraction of mistakes in the classification a little.
E.g., the fraction of unclassified objects in the real catalog
sample of the 16h-field is around 10% for R = 22 ... 23
and around 40% for R = 23 ... 24, while simulations for these
intervals yield ~ 5% and ~ 20%, respectively. As a rule of
thumb the simulations might be too optimistic by up to half a
magnitude. The dominant reason for this is supposedly cosmic
variance, i.e. differences between library and real spectra which
exceed the tolerance given by photon noise and calibration
inaccuracies. In addition, quasars suffer from variability during
the observational 5-year period of CADIS.

We conclude that roughly down to R = 22.5 ... 23.0, the
classified object catalogs can be analysed without any knowledge
of the more subtle features in the data. Statistical studies
going fainter than this have to worry about completeness issues
and about contamination of star and quasar samples by certain
types of galaxies. But given the large number of faint galaxies,
any loss of galaxies to the star and quasar class is unlikely to
be significant for statistical studies on the galaxy sample.

5.3. Quality of the redshift estimation 

The performance of the redshift estimation among the 101
galaxies with spectroscopic redshifts and MEV+ redshifts is
illustrated in Fig. 11 and Fig. 12. Here, four galaxies blended
into two double objects have been removed from the sample, since
their strongly differing redshifts (0.616 and 1.20 mixed in one
pair, 0.24 and 0.729 mixed in the other pair) cause mixed
colors unlikely to be useful. Within the given magnitude range of
17 < R < 24.5 just above 10% of all galaxies have large
redshift errors |Δz| > 0.1, which we call catastrophic mistakes.
Examining the remaining sample leads to a distribution with
zero mean error and an rms deviation of σ_z ≈ 0.03. This result
was achieved without any color adjustment of the galaxy library
to the real data. Among nearby galaxies catastrophic mistakes
tend to overestimate the redshift, while those at higher redshift
are rather underestimations.
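
The two statistics quoted here, the fraction of catastrophic
mistakes and the mean and rms of the remaining errors, can be
computed as in the following sketch. The function name and the
array inputs are assumptions of this example; only the |Δz| > 0.1
criterion is taken from the text.

```python
import numpy as np

def redshift_error_stats(z_multicolor, z_spec, catastrophic=0.1):
    """Catastrophic fraction plus mean and rms of the remaining errors."""
    dz = np.asarray(z_multicolor, dtype=float) - np.asarray(z_spec, dtype=float)
    # Catastrophic mistakes: redshift errors beyond the chosen limit.
    is_catastrophic = np.abs(dz) > catastrophic
    core = dz[~is_catastrophic]
    # Mean and rms scatter (about the mean) of the non-catastrophic errors.
    return float(is_catastrophic.mean()), float(core.mean()), float(core.std())
```

Applied to arrays of multi-color and spectroscopic redshifts, the
function returns the three numbers in the order used in the text:
catastrophic fraction, mean error and rms deviation.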

The redshift estimation for galaxies also compares rather
well between simulations and real data, except for the few
catastrophic mistakes which happen even at brighter magnitudes
in the saturation range of the estimation quality. Our
interpretation is again that cosmic variance plays the major role
in this phenomenon. If these mistakes among bright objects
are ignored, the redshift accuracy achieved matches
the simulated results from Table 4. We note that no spectrum
is available for any CADIS galaxy at z > 1.2, so we are not
in a position to check the estimation quality there. The
simulations do not point at specific problems; only the general
scatter should be large at the rather faint magnitudes expected
for these galaxies.

We also compared the true redshift errors Δz with the errors
σ_z estimated by the multi-color technique itself on the basis
of photometric errors and the galaxy distribution in color
space. The ratio Δz/σ_z evaluates the error consistency of our
redshift estimate. If the estimated errors were representative of
the true errors, this ratio should have a Gaussian distribution with
an rms of 1.0. In fact, it turns out that for 30% of the galaxies
the discrepancy is larger than 3σ, while the remaining 70%
show a more or less Gaussian distribution with an rms scatter
of 1.2 (see Fig. 12). This result implies that for one third of
the spectroscopic galaxy sample, the redshift estimation process
considers itself too accurate, supposedly a consequence of
cosmic variance that changes the galaxy SEDs and their estimated
redshifts while not changing the photometric errors.
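
The consistency test described here can be sketched in a few lines of
Python. The ratio Δz/σ_z is formed per object, objects beyond 3σ are
counted as inconsistent, and the rms of the remaining ratios is compared
with the expected value of 1.0. The function name `error_consistency`
and the toy arrays are hypothetical and only serve to illustrate the
procedure; they are not CADIS data.

```python
import numpy as np


def error_consistency(dz, sigma_est, clip=3.0):
    """Return the fraction of |dz/sigma| > clip and the rms of the rest."""
    ratio = np.asarray(dz, dtype=float) / np.asarray(sigma_est, dtype=float)
    outlier = np.abs(ratio) > clip
    core = ratio[~outlier]
    core_rms = float(np.sqrt(np.mean(core ** 2)))
    return float(outlier.mean()), core_rms


# Hypothetical toy values for illustration only, NOT CADIS measurements.
dz = np.array([0.02, -0.03, 0.01, 0.12, -0.02, 0.04, -0.01, 0.09, 0.03, -0.15])
sigma_est = np.array([0.02, 0.03, 0.02, 0.02, 0.02, 0.03, 0.02, 0.02, 0.03, 0.03])

outlier_fraction, core_rms = error_consistency(dz, sigma_est)
print(outlier_fraction, core_rms)
```

A core rms above 1.0 would indicate that the estimated errors are too
small on average, while a large outlier fraction points at errors that
are not captured by the photometric noise model at all.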

Fig. 12. Most galaxy redshifts are estimated with a Δz error
variance of ~ 0.03, but ~ 10% of the galaxies receive completely
wrong redshift assignments with |Δz| > 0.1 (left). For
70% of the galaxies the true error distribution matches
the one expected from the multi-color errors, but 30% of the
objects, mostly starburst galaxies, have true errors larger than
the estimated 3σ errors (right). The reason for the
generally increased scatter is that the observed SEDs are not
perfectly matched by the library SEDs.

Fig. 13. The estimation quality for the redshift in the spectroscopic
quasar sample is shown in this plot of multi-color redshift
vs. spectroscopic redshift. The highest redshift quasar resides
at z ≈ 3.7. Half of the quasars are estimated within an
error margin on the order of ±0.1, but the other half suffers
from catastrophic mistakes.



Among the 21 spectroscopic quasars, only the least luminous
and most nearby object has no MEV+ redshift estimate.
The performance of the redshift estimation among the remaining
20 objects is shown in Fig. 13. Half of the quasars
are rather well estimated, with a mean redshift error of +0.01
and an rms deviation of σ_z ≈ 0.03. For the other half of the objects
the redshift estimates are completely wrong. The problem here
is not only the lack of detectable continuum features for
low-redshift quasars, but especially the intrinsic long-term
variability of quasars, which offsets the magnitudes measured in the
various bands depending on the actual epoch of observation.

The simulations of the redshift estimation for quasars at
R = 22 agree well with the cross-check spectroscopy at z > 2,
but they underestimate the errors of low-redshift objects. Again,
the key problem here most likely is variability that scrambles 
the real spectrophotometric data collected over several years, 
while the simulation assumes an instantaneous measurement 
of colors by taking them from the color library. 

5.4. Potential improvements of the classification 

The classification works essentially well for ~ 99% of the ob- 
jects with useful magnitudes. Among the exceptions are some 
objects which appear strange to the classifier, although they 
belong to physically common object classes. Our scheme of 
classification is based on three fundamental ingredients: data, 
library and classifier. Improvements to the capabilities of the 
classified catalog could thus be achieved in the following form: 

1. Improve the data by observing the missing bands or, if desired,
by adding filters or going deeper. From the missing bands we
expect a more accurate redshift estimation for some objects
and a slightly smaller fraction of unclassifiable objects. If
the data were collected within a shorter period of time, the
influence of variability could be reduced. But variable objects
are detected in CADIS anyway by repeated R-band
observations during many runs.

2. Improve the library by, e.g., adding some mixed templates 
for active galactic nuclei of low luminosity, adding a cou- 
ple more stars at the low-temperature end, increasing the 
spread of metallicities among M stars. It is not obvious how 
to best account for all oddities imaginable among quasars 
and how to account for their variability when it comes to 
redshift estimates. In particular, it is not clear whether an
enlargement of the quasar parameter space would improve the
result. Since the phenomenon is not under perfect control, an
approach as simple as ours might be better.

3. Improve the classifier and estimator. This seems to be an is- 
sue basically at the faint level, where the number of classi- 
fication mistakes becomes significant. The mistakes could 
be reduced, e.g. by taking class richness into account for 
the membership probabilities (which we do anyway for the 
unclassified objects by simply taking them all as galaxies), 
but rare valid quasar candidates could then be lost as most
likely galaxies. Ultimately, at the limits of the survey
performance, different applications might require different
approaches.
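
The trade-off mentioned in item 3 can be illustrated with a generic
Bayesian weighting of class likelihoods by class richness. The sketch
below is not the classifier of paper I; the function name
`class_posteriors`, the likelihood values and the richness priors are
all hypothetical numbers chosen only to show how a rare quasar
candidate can be reassigned to the rich galaxy class once priors are
applied.

```python
def class_posteriors(likelihoods, priors=None):
    """Turn per-class likelihoods into normalized membership probabilities.

    If `priors` is None, all classes are weighted equally; otherwise each
    likelihood is multiplied by the relative richness of its class.
    """
    if priors is None:
        priors = {name: 1.0 for name in likelihoods}
    weighted = {name: likelihoods[name] * priors[name] for name in likelihoods}
    total = sum(weighted.values())
    return {name: value / total for name, value in weighted.items()}


# Hypothetical color-based likelihoods for one faint object.
likelihoods = {"star": 0.10, "galaxy": 0.30, "quasar": 0.60}
# Hypothetical relative class richness at faint magnitudes.
richness = {"star": 0.10, "galaxy": 0.89, "quasar": 0.01}

flat = class_posteriors(likelihoods)
weighted = class_posteriors(likelihoods, richness)

print(max(flat, key=flat.get))          # most likely class without priors
print(max(weighted, key=weighted.get))  # most likely class with priors
```

With equal weights the object is a quasar candidate; with the richness
priors it becomes a most likely galaxy, which reduces contamination of
the quasar sample at the cost of completeness.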

6. Summary 

In paper I an innovative method for identifying stars, galaxies
and quasars in multi-color surveys was presented, which
uses a library of ≈ 65000 color templates for comparison with
observed objects. The method aims at extracting the information
content of object colors in a statistically correct way, and
performs a classification as well as a redshift estimation for 
galaxies and quasars in a unified approach based on the same 
probability density functions. 

The three basic ingredients to this method are

1. accurately measured and calibrated color data for the objects
to be classified, including the color errors

2. accurate and representative color template libraries cover- 
ing the range of objects expected in the dataset obtained 

3. a statistical classifier and parameter estimator which can 
potentially be trimmed for best performance in particular 
applications. 

Also, in paper I it was concluded that medium-band surveys 
are expected to deliver a performance superior to pure broad- 
band surveys even under the constraint of equally limited tele- 
scope time. Based on survey simulations this method should be 
capable of 

1. separating stars from galaxies down to deeper limits than 
possible by using morphology only 

2. finding quasars also at redshifts where their colors overlap
with the stellar locus in two-color projections, particularly
in the range of 2.2 < z < 3.5

3. selecting quasar candidates without reference to their mor- 
phology so that objects of low-luminosity with resolved 
hosts would not be excluded from the sample obtained 

4. using multi-color redshifts from medium-band surveys di- 
rectly and without spectroscopic follow-up for statistical in- 
vestigations, e.g. for studying the evolution of galaxy pop- 
ulations with redshift. 

In this paper, we applied the classification scheme to a real 
multi-color dataset provided by the Calar Alto Deep Imaging 
Survey (CADIS) investigating its reliability and accuracy. The 
dataset is 80% complete on the three CADIS fields presented here,
the 1h-, the 9h- and the 16h-field. Some filters on some fields are
still missing and observations of some others have not yet reached
the desired depth. In any case, the CADIS filterset was
tailored to the requirements of the emission line survey and is 
not an optimal choice for a general multi-color classification. 

The libraries of stars, galaxies and quasars presented in pa- 
per I are sufficient to classify all but a handful of unusual ob- 
jects in the CADIS dataset as the fields do not contain signifi- 
cant numbers of objects missing in the libraries. This is demon- 
strated by the low fraction (< 1%) of strange objects in the 
sample at R < 24. Our classification differentiates between
unclassified objects, which cannot be securely assigned to one of the
alternatives, and strange objects, which are outliers in color space
inconsistent with any library object. We note that unclassified
objects are typically not strange and vice versa. 

The classified subsamples of stars and galaxies agree with 
expected numbers and the unusually high content of quasars at 
z > 2 was confirmed by spectroscopic observations (Wolf et 
al. 1999). The fraction of unclassified objects is less than 1% 
at R < 22 and reaches about 50% at R = 24 due to increas- 
ing photon noise that tends to make different original SEDs 
equally likely sources of the observed colors. At some level
in between, the classification becomes incomplete and also features
an increasing fraction of actual mistakes. These are basically
all members of the rich galaxy population spilling over
into regions of color space that are usually occupied by stars and
quasars.



A spectroscopic cross check using 162 identifications con- 
firmed the multi-color classification to work essentially free of 
errors at R < 22 (two mistakes among 103 objects). We do 
not have proper knowledge about where and how the classi- 
fication collapses exactly, which is particularly important for 
the rare quasar class which will become incomplete and dom- 
inated by contaminating galaxies at some level. To settle this 
uncertainty, dedicated spectroscopic observations are required. 
These would be a valuable and not too time-consuming invest- 
ment, if a large number of spectra from a CADIS size field 
could be taken simultaneously with an instrument like VMOS 
at the VLT. 

The findings on the classification performance are consistent
with the expectations raised by the Monte-Carlo simulations
of the CADIS filter set. They suggest that the classification
works nearly perfectly down to R ≈ 23, except for some contamination
of quasar candidates by emission line galaxies towards
that limit. The simulations actually appear to be too optimistic
about the working depth of the classification by about a third of
a magnitude. Of course, a real survey will perform worse than
a simulation that cannot account for differences between the
real world and our library and does not contain artifacts and
variable or blended objects. The latter issues have to be dealt
with by better data analysis.

For galaxies, our multi-color redshifts are useful down to 
R ~ 24. The statistics on the redshift errors are dominated 
by ~ 10% catastrophic mistakes, where the estimator decides 
for the wrong one among alternative values with comparable 
probability. The core of the error distribution has a zero mean
error and an rms width of σ_z ≈ 0.03. Quiescent galaxies tend to
work better than starburst types.

Half of the quasars receive remarkably correct estimates
with an average variance of σ_z ≈ 0.1, preferentially the z > 2
objects and those of higher luminosity. In contrast, a large
amount of redshift confusion is expected at lower redshift. 
Also, simulations for quasars work better than the real redshift 
estimation, probably mostly due to variability. 

Finally, the classified multi-color catalog has been used
for several dedicated studies published in separate papers:

1. The star-galaxy separation with the multi-color data 
reaches deeper than a morphological separation. Phleps et 
al. (2000) could therefore use the sample of stars to probe 
the stellar content and the Galactic structure along the pen- 
cil beams established by the CADIS fields, where they find 
strong evidence of a thick disk. 

2. The multi-color redshifts of galaxies are sufficiently accu- 
rate for most statistical studies. Also, the classification in- 
cludes a substantial fraction of compact galaxies into the 
sample. Using that, Fried et al. (2000) investigated the evolution
of galaxies within 0.3 < z < 1, finding evidence
that strong evolution takes place only among starburst
objects.

3. The multi-color data allow quasars to be identified rather free of
contaminants. Spectroscopy is not required to clean a sample
obtained by the classification. The selection should also
be rather complete, since CADIS finds an unexpectedly
high density of z > 2 quasars. Many of these might reside
close to the stellar locus and be overlooked in broad-band
surveys (Wolf et al. 1999). Thus, quasars can be identified
with a much more uniform completeness across the accessible
redshift range and more homogeneous samples can
be obtained. At z > 2, it will be possible to constrain the
evolution of the luminosity function from a large enough
multi-color sample.

Appendix A: Notes on peculiar objects

Here, we would like to discuss in detail some objects with pe- 
culiar properties making them a non-trivial case for the classi- 
fication. Some of these are also classified as strange, while the 
peculiarity of others was only revealed by taking a closer look: 

- The bluest object of R < 24 in the present dataset is a
compact source in the 16h-field with B ≈ 22.8 and R ≈
22.95 which is classified as a star. An observation of such
a blue star at this magnitude level would more plausibly be
explained by a white dwarf with M_V ≈ 11 residing at a
distance of about 2.5 kpc than by a B7V star of M_V ≈ -1
that would have to be far out at 600 kpc distance. In fact, the object
was confirmed as a white dwarf on the basis of its broad
Hβ absorption line during the first spectroscopic observing
run.

- The reddest object of I < 23 is also a compact source in
the 16h-field with R > 26, I ≈ 22.5 and K' ≈ 18.6. It
is classified as a galaxy at z ≈ 1 having a very red SED,
but spectroscopically we identified it as an L1 star (Wolf
et al. 1998). Obviously, the best-fitting red galaxy template
matched the observed colors better than the reddest star
available in the Pickles library, which is in fact of type M8.
So, the problem here is an insufficient library lacking
L-type stars.

- The most nearby extragalactic object confirmed is a galaxy
in the 16h-field residing at z = 0.035. It is classified as a
galaxy with a redshift estimate of z = 0.031 ± 0.003. The
redshift is mostly constrained by an emission line showing
up in the medium-band filter 522/15 (i.e. O III) in combination
with the absence of any continuum drop within
the filter set that would be caused by a 4000 Å break. Upon
completion of the full data set, we should also see an Hα line
in the filter 683/18. Since the blue side of the spectrum
is somewhat sparsely sampled by our filters, we expected
problems for an accurate redshift estimate of galaxies at
z < 0.2, which appear to be not so common for emission
line galaxies. The redshift yields a distance of 210 Mpc
with H_0 = 50 km/(s Mpc). Using an aperture-corrected total
magnitude of B ≈ 21.85, this galaxy has a rather moderate
luminosity of M_B ≈ -14.8.

- The 01h-field contains a compact source of R = 23.4
which appears in the filter 815/20 about three magnitudes
brighter than in the neighboring bands. It is classified as
a rather strange galaxy with z_phot = 1.78. Although we
do not yet have a spectrum of this object, we can probably
clarify its nature: In fact, the high flux in the 815/20
filter is consistent with a very strong emission line seen by
the Fabry-Perot observations ranging from 814 to 824 nm.
While the filter 815/20 suggests a total equivalent width for
the contained lines of ~ 2700 ± 600 Å, a line fit to the
FPI fluxes yields λ_cen = 815.0 nm and an equivalent width
of ~ 1700 ± 500 Å. In addition, we see the filter 611/16
brightened by an emission line and a multi-color spectrum
suggestive of a z ≈ 0.4 . . . 0.7 galaxy.

The only consistent picture is an interpretation of the line
in the FPI as O III 5007, placing the object at z = 0.628.
The line in the 611/16 filter is then O II 3727, while the
line O III 4959 is not covered by the FPI wavelength range
but contributes to the total flux in the filter 815/20, lifting
the total equivalent width of the line pair up to ~
2300 ± 650 Å. Physically, this means the object is emitting
~ 2 × 10^42 erg/s in its O III line from a giant H II region
or nuclear starburst. According to Balzano (1983), about
3% of all field galaxies with M_B < -17.5 in the local
universe show nuclear starbursts with more than 10^40 erg/s
flux in the Hα line, but less than 0.01% have Hα fluxes
above 10^42 erg/s. While the flux ratio of O III and Hα in
her sample shows quite a spread, Hα tends to be stronger.
We therefore conclude that we most likely see one of those
rare objects here. In our present sample of emission-line
galaxies found with the FPI, this object has the most luminous
line flux.

- Our only present z > 4 quasar candidate was identified
by spectroscopy as an emission line galaxy at z = 0.265.
The object of R = 22.9 is classified as a strange quasar
with z_phot = 4.12, mostly because of its red B-R color
and blue R-I color in combination with a strong emission
line seen in the filter 628/16, which also contributes to
the R-band flux. If the strong line flux (equivalent width
≈ 1200 Å) is taken into account, the true level of the R-band
continuum can be recovered and matched to a normal
nearby blue galaxy spectrum. A pure broad-band survey
would supposedly have picked this object as a z > 4
quasar candidate. But our method produced it as a quasar
candidate, too, although CADIS provided all relevant photometry
to identify it as a low-redshift emission-line object.

- The most nearby active galaxy in our present sample is a
Seyfert-1 galaxy at z = 0.474 found by chance in the 16h-field.
It is the least luminous AGN we found, having M_B =
-21.7 derived from R = 20 for H_0 = 50 km/(s Mpc) and
q_0 = 0.5 (Wolf et al. 1999). It is compact in appearance
and classified as a strange starburst galaxy. The strangest
part of its multi-color spectrum is a rather bright K-band
magnitude of K' = 16.3 indicating an NIR excess. In fact,
at lower luminosities and redshifts we have to expect quite
some incompleteness in searching for Seyfert-1 galaxies.
First, our libraries do not contain any NIR excess over a
pure power law and second, spectra of faint Seyferts are
likely to be a superposition of a host galaxy with an active
nucleus, requiring some composite templates.

- Another Seyfert-1 galaxy confirmed at z = 1.22 in the 9h-field,
having R = 21.2 and M_B = -22.2, is classified as
a strong starburst galaxy from the present dataset. However,
when fewer filters were available it was classified and
found as a quasar. This is our only case where a significant
enlargement of available color information removed a good
and proven candidate from the correct class.

- Only one of the quasars we identified appears extended. It
resides at z = 1.57 in the 16h-field, having B = 21.5 and
M_B = -23.4. We believe we have resolved a luminous host
galaxy, since the redshift estimates for both the galaxy and
the quasar class coincide with z_phot = 1.83 ± 0.03 and
z_phot = 1.63 ± 0.01, respectively. An alternative interpretation
is that we are just looking at a chance projection of
the quasar with a foreground object, but our spectrum of
this object is too noisy to look more closely into that.
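
Several of the numbers quoted in the notes above can be cross-checked
with elementary relations: the distance modulus for the white dwarf
and B7V alternatives, the low-redshift Hubble law for the galaxy at
z = 0.035, and the redshifted wavelengths of the O III and O II lines
for the object with the strong line in the 815/20 filter. The Python
sketch below is only such a consistency check. It assumes that a filter
name like 611/16 denotes central wavelength and width in nm, treats each
filter as a simple top-hat, uses d = cz/H_0 as a low-redshift
approximation, and adopts an O III 5007/4959 flux ratio of 3; all
function names are hypothetical.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def apparent_magnitude(abs_mag, distance_pc):
    """Distance modulus: m = M + 5 log10(d / 10 pc)."""
    return abs_mag + 5.0 * math.log10(distance_pc / 10.0)


def absolute_magnitude(app_mag, distance_pc):
    """Inverse distance modulus: M = m - 5 log10(d / 10 pc)."""
    return app_mag - 5.0 * math.log10(distance_pc / 10.0)


def hubble_distance_mpc(z, h0=50.0):
    """Low-redshift approximation d = c z / H_0 (assumption, no cosmology)."""
    return C_KM_S * z / h0


def in_filter(wavelength_nm, center_nm, width_nm):
    """Top-hat filter assumption: center +/- half of the quoted width."""
    return abs(wavelength_nm - center_nm) <= width_nm / 2.0


# Blue star: white dwarf at 2.5 kpc versus B7V star at 600 kpc.
m_wd = apparent_magnitude(11.0, 2.5e3)    # about 23.0
m_b7v = apparent_magnitude(-1.0, 6.0e5)   # about 22.9

# Nearby galaxy at z = 0.035 with B = 21.85 and H_0 = 50 km/(s Mpc).
d_mpc = hubble_distance_mpc(0.035)                 # about 210 Mpc
abs_mag_b = absolute_magnitude(21.85, d_mpc * 1e6)  # about -14.8

# Strong line at 815.0 nm interpreted as O III 5007 (rest 500.7 nm).
z_line = 815.0 / 500.7 - 1.0                # about 0.628
oii_obs = 372.7 * (1.0 + z_line)            # O II 3727, observed frame
oiii_4959_obs = 495.9 * (1.0 + z_line)      # O III 4959, observed frame

# Equivalent width of the pair, assuming a 3:1 flux ratio (assumption).
ew_pair = 1700.0 * (1.0 + 1.0 / 3.0)        # about 2300 Angstrom

print(round(m_wd, 2), round(m_b7v, 2), round(d_mpc, 1), round(abs_mag_b, 2))
print(round(z_line, 3), in_filter(oii_obs, 611.0, 16.0),
      in_filter(oiii_4959_obs, 815.0, 20.0), round(ew_pair))
```

Under these assumptions both stellar alternatives reproduce the observed
magnitude of about 23, so the choice between them rests on plausibility
of the distance, and the O II and O III 4959 lines fall into the 611/16
and 815/20 filters, respectively, as stated in the text.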